
Summary Overview


Introduction

Automatic summarization, like summarization in general, aims to create a shortened version of a text without losing its general meaning. The difference is that automatic summarization aims to produce a meaningful summary without human intervention. Typically it involves two basic steps: first, the most semantically important sentences are located and extracted; second, these sentences are combined into a logically connected text without disruptions. These two steps correspond to the two basic types of automatic summarization: extractive and abstractive.

Extractive summary

Extractive summary generation focuses on recognizing the most important sentences and extracting them from the text, without paraphrasing them into a coherent text. This is the less complex of the two approaches, since the algorithms only need to find the semantically important sentences.
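The idea of locating and extracting important sentences can be illustrated with a minimal Python sketch. It is not the ATLAS implementation: the word-frequency scoring, the naive sentence splitter, the small stop-word list, and all function names are assumptions chosen only to keep the example short.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not a standard resource).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it",
              "that", "on", "for", "are", "was"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens that are not stop words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def extractive_summary(text, max_sentences=2):
    """Return the highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole text act as a crude importance signal.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore document order
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    sample = ("Cats sleep a lot. Cats eat fish. Dogs bark. "
              "The weather is nice today.")
    print(extractive_summary(sample))
```

The selected sentences are returned verbatim and in document order, which is what makes the result extractive: nothing is rephrased or merged.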

Abstractive summary

Abstractive automatic summarization, on the other hand, deals with restating the extracted sentences and gluing them into a coherent summarized text. This approach is significantly more complex because it involves natural language generation, which draws on many NLP research areas, such as anaphora resolution, discourse analysis, named entity recognition, sentence boundary disambiguation, and entailment. Many of these research areas are also used in extractive summarization to fine-tune the algorithms for sentence selection.

ATLAS approach

ATLAS uses two approaches to summarization, one for short texts and one for long texts.

The first approach, suitable for shorter texts, generates summaries as a sequence of discourse clauses extracted from the original text, after obtaining the discourse structure of the text and exploiting cohesion and coherence properties.

The second approach estimates the importance of each sentence in a text and outputs the most important sentences as the summary. This method gives better results on longer texts.
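How ATLAS actually estimates sentence importance is documented in the deliverable and is not reproduced here. As a generic illustration only, the sketch below scores each sentence by how much vocabulary it shares with every other sentence (a simple centrality measure based on Jaccard overlap) and keeps a fixed fraction of the text. The similarity measure, the `ratio` parameter, and the function names are all assumptions made for this example.

```python
import re


def sentence_words(sentence):
    """Set of lowercase alphabetic tokens in a sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def jaccard(a, b):
    """Jaccard overlap of two word sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def centrality_scores(sentences):
    """Score each sentence by its summed similarity to all other sentences."""
    word_sets = [sentence_words(s) for s in sentences]
    scores = []
    for i, words in enumerate(word_sets):
        total = sum(jaccard(words, other)
                    for j, other in enumerate(word_sets) if j != i)
        scores.append(total)
    return scores


def top_sentences(sentences, ratio=0.3):
    """Keep roughly `ratio` of the sentences (at least one), in document order."""
    if not sentences:
        return []
    k = max(1, round(len(sentences) * ratio))
    scores = centrality_scores(sentences)
    best = sorted(range(len(sentences)),
                  key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(best)]


if __name__ == "__main__":
    doc = [
        "summarization shortens text",
        "extractive summarization selects sentences from text",
        "the cat sat on the mat",
        "abstractive summarization rewrites text",
    ]
    print(top_sentences(doc, ratio=0.5))
```

A sentence that shares no vocabulary with the rest of the text receives a score of zero and is the first to be dropped. With longer texts there are more sentences to compare against, so scores of this kind tend to separate central sentences from peripheral ones more clearly.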

For more details refer to Deliverable D5.1 – “Text summarization tools”.